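Meta-learning from learning curves is an important yet often overlooked research area in the machine learning community. We introduce a series of learning-based meta-learning challenges, in which an agent searches for the algorithm best suited to a given dataset, based on learning-curve feedback from the environment. The first round attracted participants from both academia and industry. This paper analyzes the results of the first round (accepted to the competition program of WCCI 2022) to understand what makes a meta-learner successful at learning from learning curves. Drawing on the lessons learned from the first round and on feedback from participants, we designed the second round of the challenge with a new protocol and a new meta-dataset. The second round has been accepted at AutoML-Conf 2022 and is currently ongoing.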
Translated by Google Translate
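We study the complexity of PAC learning halfspaces in the presence of Massart noise. In this problem, we are given i.i.d. labeled examples $(\mathbf{x}, y) \in \mathbb{R}^n \times \{\pm 1\}$, where the distribution of $\mathbf{x}$ is arbitrary and the label $y$ is a Massart corruption of $f(\mathbf{x})$, for an unknown halfspace $f: \mathbb{R}^n \to \{\pm 1\}$, with flipping probability $\eta(\mathbf{x}) \leq \eta < 1/2$. The goal of the learner is to compute a hypothesis with small 0-1 error. Our main result is the first computational hardness result for this learning problem. Specifically, assuming the (widely believed) subexponential-time hardness of the Learning with Errors (LWE) problem, we show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small, namely $\mathrm{OPT} = 2^{-\log^{c}(n)}$ for any universal constant $c \in (0, 1)$. Prior work provided qualitatively similar evidence of hardness in the Statistical Query model. Our computational hardness result essentially resolves the polynomial PAC learnability of Massart halfspaces, showing that the known efficient learning algorithms for the problem are nearly the best possible.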
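The Massart noise model described above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the paper: the Gaussian distribution over $\mathbf{x}$, the weight vector `w`, and the particular noise function `eta_of_x` are illustrative assumptions (the abstract allows an arbitrary distribution and any $\eta(\mathbf{x}) \leq \eta < 1/2$).

```python
import random

def halfspace(w, x):
    """Return the +1/-1 label of the halfspace sign(<w, x>)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def massart_sample(w, x, eta_of_x, eta, rng):
    """Return f(x) flipped with probability eta_of_x(x), where eta_of_x(x) <= eta < 1/2."""
    flip_prob = eta_of_x(x)
    if not 0.0 <= flip_prob <= eta < 0.5:
        raise ValueError("Massart noise requires 0 <= eta(x) <= eta < 1/2")
    clean = halfspace(w, x)
    return -clean if rng.random() < flip_prob else clean

rng = random.Random(0)
w = [1.0, -2.0, 0.5]   # hypothetical target halfspace
eta = 0.3              # upper bound on the flipping probability

def eta_of_x(x):
    # Hypothetical adversarial choice: flip only points close to the boundary.
    margin = abs(sum(wi * xi for wi, xi in zip(w, x)))
    return eta if margin < 1.0 else 0.0

# Gaussian inputs are an assumption made only for this illustration.
xs = [[rng.gauss(0, 1) for _ in w] for _ in range(2000)]
ys = [massart_sample(w, x, eta_of_x, eta, rng) for x in xs]

# Empirical 0-1 error of the true halfspace; its expectation is E[eta(x)] <= eta.
err = sum(halfspace(w, x) != y for x, y in zip(xs, ys)) / len(xs)
print(f"0-1 error of the true halfspace: {err:.3f}")
```

Because the flipping probability may vary per point, the error of the target halfspace can be far below $\eta$, which is the regime the hardness result addresses.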
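Three-dimensional (3D) kidney parsing on computed tomography angiography (CTA) images is of great clinical significance. Automatic segmentation of the kidney, renal tumor, renal vein, and renal artery greatly benefits surgery-based renal cancer treatment. In this paper, we propose a new NNHRA-UNET network and use a multi-stage framework built on it to segment the multiple structures of the kidney and to participate in the KIPA2022 challenge.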
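Contrastive learning has been shown to be effective in alleviating the high demand for expensive annotations in medical image analysis: it captures general patterns in images and naturally serves as an initial feature extractor for various tasks. Recent works are mainly based on instance-wise discrimination and learn global discriminative features; however, they cannot help clinicians deal with tiny anatomical structures, lesions, and tissues, which are mainly distinguished by local similarities. In this work, we propose a general unsupervised framework to learn local discriminative features from medical images for model initialization. Following the observation that images of the same body region should share similar anatomical structures, and that pixels of the same structure should have similar semantic patterns, we design a neural network that constructs a local discriminative embedding space in which pixels with similar contexts are clustered and dissimilar pixels are dispersed. The network mainly contains two branches: an embedding branch that generates pixel-wise embeddings, and a clustering branch that gathers pixels of the same structure together and generates segmentations. A region discriminative loss is proposed to optimize the two branches in a mutually beneficial manner, so that pixels grouped together by the clustering branch share similar embedding vectors and the trained model can measure pixel-wise similarity. When transferred to downstream tasks, the feature extractor learned with our framework shows better generalization ability, outperforming those from a wide range of state-of-the-art methods and winning 11 of the 12 downstream tasks on color fundus and chest X-ray images. Furthermore, we use the pixel-wise embeddings to measure regional similarity and propose a shape-guided cross-modality segmentation framework and a center-sensitive one-shot landmark localization algorithm.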
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from a trade-off between accuracy and efficiency. Recent advances in neural operators, a class of mesh-independent neural-network-based PDE solvers, suggest that this challenge may be overcome. In this emerging direction, the Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and on ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab for diverse applications of partial differential equations.
Rankings are widely collected in various real-life scenarios, leading to the leakage of personal information such as users' preferences on videos or news. To protect rankings, existing works mainly develop privacy protection for a single ranking within a set of rankings, or for pairwise comparisons of a ranking, under $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting ranks. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm that generates synthetic rankings while satisfying $\epsilon$-ranking differential privacy. Theoretical results regarding the utility of synthetic rankings in downstream tasks, including the inference attack and the personalized ranking tasks, are established. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking based on synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how their privacy preferences affect the consistency in estimating the optimal ranking function. Extensive numerical experiments are carried out to verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
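To illustrate how a Mallows model can produce synthetic rankings, here is a minimal sketch using the standard repeated-insertion sampler with the Kendall tau distance. It is not the paper's multistage algorithm: the function names are hypothetical, the choice of Kendall tau as the distance is an assumption, and the dispersion `phi` is left as a free parameter because the abstract does not state how $\epsilon$ maps to it.

```python
import random

def kendall_tau(a, b):
    """Count the item pairs that rankings a and b order differently."""
    pos = {item: i for i, item in enumerate(b)}
    n = len(a)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if pos[a[i]] > pos[a[j]]
    )

def sample_mallows(center, phi, rng):
    """Draw a ranking r with probability proportional to phi ** kendall_tau(r, center)."""
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in (0, 1]")
    ranking = []
    for i, item in enumerate(center):
        # Inserting at index j puts (i - j) earlier items behind this one,
        # which creates exactly (i - j) inversions relative to the center.
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, item)
    return ranking

rng = random.Random(0)
center = ["a", "b", "c", "d", "e"]  # stands in for a user's true ranking
for phi in (0.2, 1.0):
    draws = [sample_mallows(center, phi, rng) for _ in range(500)]
    mean_dist = sum(kendall_tau(r, center) for r in draws) / len(draws)
    print(f"phi={phi}: mean Kendall tau distance {mean_dist:.2f}")
```

A smaller `phi` keeps synthetic rankings close to the true one (more utility, less protection), while `phi = 1` yields uniformly random rankings; this mirrors the privacy-utility trade-off the abstract quantifies through $\epsilon$.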
Because they offer more comprehensive information than data from a single view, multi-view (multi-source, multi-modal, multi-perspective, etc.) data are being used more frequently in remote sensing tasks. However, as the number of views grows, the issue of data quality becomes more apparent, limiting the potential benefits of multi-view data. Although recent deep neural network (DNN) based models can learn the weight of data adaptively, the lack of research on explicitly quantifying the data quality of each view during fusion leaves these models hard to interpret, unsatisfactory in performance, and inflexible in downstream remote sensing tasks. To fill this gap, in this paper, evidential deep learning is introduced to the task of aerial-ground dual-view remote sensing scene classification to model the credibility of each view. Specifically, the theory of evidence is used to calculate an uncertainty value that describes the decision-making risk of each view. Based on this uncertainty, a novel decision-level fusion strategy is proposed to ensure that the view with lower risk obtains more weight, making the classification more credible. On two well-known, publicly available datasets of aerial-ground dual-view remote sensing images, the proposed approach achieves state-of-the-art results, demonstrating its effectiveness. The code and datasets of this article are available at the following address: https://github.com/gaopiaoliang/Evidential.
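The idea of turning per-view evidence into an uncertainty value and weighting the views accordingly can be sketched as follows. This is an illustration under assumptions, not the paper's method: it uses the common evidential formulation in which Dirichlet parameters are `evidence + 1` and uncertainty is `K / S` (number of classes over Dirichlet strength), and the `1 - uncertainty` weighting in `fuse` is a hypothetical stand-in for the proposed fusion strategy.

```python
def opinion(evidence):
    """Map non-negative per-class evidence to (expected probabilities, uncertainty)."""
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]       # Dirichlet parameters
    strength = sum(alpha)                     # Dirichlet strength S
    probs = [a / strength for a in alpha]     # expected class probabilities
    return probs, num_classes / strength      # uncertainty u = K / S

def fuse(aerial_evidence, ground_evidence):
    """Weight each view's probabilities by (1 - uncertainty); hypothetical rule."""
    p_a, u_a = opinion(aerial_evidence)
    p_g, u_g = opinion(ground_evidence)
    w_a, w_g = 1.0 - u_a, 1.0 - u_g
    total = w_a + w_g
    if total == 0.0:
        # Neither view carries evidence: fall back to equal weights.
        w_a = w_g = 0.5
        total = 1.0
    fused = [(w_a * a + w_g * g) / total for a, g in zip(p_a, p_g)]
    return fused, (u_a, u_g)

# A confident aerial view and a nearly uninformative ground view (made-up numbers).
fused, (u_aerial, u_ground) = fuse([9.0, 1.0, 0.0], [0.2, 0.5, 0.3])
print("uncertainties:", round(u_aerial, 3), round(u_ground, 3))
print("fused probabilities:", [round(p, 3) for p in fused])
```

In this toy case the ground view alone would favor the second class, but its high uncertainty gives it little weight, so the fused decision follows the lower-risk aerial view.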
A noisy training set usually degrades the generalization and robustness of neural networks. In this paper, we propose a novel, theoretically guaranteed clean sample selection framework for learning with noisy labels. Specifically, we first present a Scalable Penalized Regression (SPR) method to model the linear relation between network features and one-hot labels. In SPR, the clean data are identified by the zero mean-shift parameters solved in the regression model. We theoretically show that SPR can recover clean data under some conditions. In general scenarios, these conditions may no longer be satisfied, and some noisy data are falsely selected as clean data. To solve this problem, we propose a data-adaptive method for Scalable Penalized Regression with Knockoff filters (Knockoffs-SPR), which provably controls the False-Selection-Rate (FSR) in the selected clean data. To improve efficiency, we further present a split algorithm that divides the whole training set into small pieces that can be solved in parallel, making the framework scalable to large datasets. While Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data. Experimental results on several benchmark datasets and real-world noisy datasets show the effectiveness of our framework and validate the theoretical results of Knockoffs-SPR. Our code and pre-trained models will be released.
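The notion of identifying clean samples through zero mean-shift parameters can be illustrated with a generic mean-shift penalized regression. The sketch below is a simplification under stated assumptions, not the SPR or Knockoffs-SPR procedure: it uses scalar responses instead of one-hot labels, synthetic features instead of network features, a fixed penalty `lam`, and a plain alternating solver; all names and numbers are hypothetical.

```python
import numpy as np

def mean_shift_regression(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - X @ beta - gamma||^2 + lam * ||gamma||_1 by alternation."""
    gamma = np.zeros_like(y)
    for _ in range(n_iter):
        # With gamma fixed, beta is an ordinary least-squares fit to y - gamma.
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # With beta fixed, gamma is the soft-thresholded residual.
        resid = y - X @ beta
        gamma = np.sign(resid) * np.maximum(np.abs(resid) - lam, 0.0)
    return beta, gamma

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))                 # stand-in for network features
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.05 * rng.normal(size=60)
y[:6] += 5.0                                 # corrupt the first six responses

beta, gamma = mean_shift_regression(X, y, lam=1.0)
clean = np.flatnonzero(gamma == 0.0)         # samples with zero mean shift
print("samples flagged as noisy:", np.flatnonzero(gamma != 0.0))
```

Samples whose mean-shift parameter is exactly zero are treated as clean; the corrupted samples need a nonzero shift to be fit and are therefore flagged.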
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. All existing works first use a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods overlook two indispensable issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take adjacent irrelevant frames as new boundaries. 2) Reasoning-bias: Such incorrect new boundary frames also lead to reasoning bias during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, in this paper, we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationship among these frames and generate soft labels on boundaries for more accurate frame-query reasoning. This mechanism can also supplement the sampled sparse frames with the missing consecutive visual semantics for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
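The boundary-bias issue can be seen in a small numeric example. The sketch below is only an illustration: the uniform center-of-bin sampling rule, the helper names, and the frame numbers are assumptions, and it does not implement the siamese sampling mechanism of SSRN.

```python
def uniform_sample(num_frames, num_samples):
    """Return the frame indices kept by uniform sparse sampling (one per equal bin)."""
    step = num_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]

def snapped_boundary(sampled, start, end):
    """Return the sampled frames closest to the annotated start and end frames."""
    def nearest(target):
        return min(sampled, key=lambda frame: abs(frame - target))
    return nearest(start), nearest(end)

# Hypothetical video of 3000 frames, downsampled to 64 frames.
sampled = uniform_sample(num_frames=3000, num_samples=64)
start, end = 1210, 1475  # hypothetical annotated boundary frames
new_start, new_end = snapped_boundary(sampled, start, end)

print("annotated boundary:", (start, end))
print("boundary after sampling:", (new_start, new_end))
print("start frame kept:", start in sampled, "| end frame kept:", end in sampled)
```

Neither annotated frame survives the downsampling here, so a model trained on the sampled frames sees neighboring frames as the segment boundaries, which is the situation the abstract calls boundary-bias.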